An Optimization of Backup Storage using Backup History and Cache Knowledge in reducing Data Fragmentation for In_line deduplication in Distributed

Authors

  • Abhijit Goswami
  • Nitin Shivale
Abstract

After deduplication, the chunks of data produced by a backup are physically scattered across the backup system, which creates a problem known as fragmentation. Fragmentation mainly takes the form of sparse containers and out-of-order containers. Sparse containers degrade both restore performance and garbage-collection efficiency, while out-of-order containers degrade restore performance when the restore cache is small. To overcome this fragmentation problem, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR gathers historical information in backup systems to define, identify, and reduce sparse containers, and CAF exploits restore cache knowledge to find the out-of-order containers that hurt restore performance. CAF supports HAR on datasets where out-of-order containers are prominent. To reduce the metadata overhead of garbage collection, we exploit the Container-Marker Algorithm (CMA) to identify valid containers instead of valid chunks. Our results help show how HAR significantly improves the restore...
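As a rough illustration of the sparse-container idea in the abstract, the sketch below records which containers a finished backup used poorly and flags duplicate chunks in the next backup that still live in those containers. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tuple layout `(fingerprint, container_id, size)`, and the 0.5 utilization threshold are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def find_sparse_containers(backup_chunks, container_size, threshold=0.5):
    """Return IDs of containers whose utilization by this backup is below threshold.

    backup_chunks: iterable of (fingerprint, container_id, size) tuples, one per
    chunk reference in the backup (hypothetical layout for this sketch).
    Utilization = bytes of distinct referenced chunks / container_size.
    """
    seen = set()
    referenced_bytes = defaultdict(int)
    for fingerprint, container_id, size in backup_chunks:
        if fingerprint in seen:
            continue  # count each distinct chunk once per backup
        seen.add(fingerprint)
        referenced_bytes[container_id] += size
    return {
        cid for cid, used in referenced_bytes.items()
        if used / container_size < threshold
    }


def rewrite_decisions(next_backup_chunks, sparse_containers):
    """For each duplicate chunk of the next backup, decide whether to rewrite it.

    A chunk is marked for rewriting when the container holding its existing
    copy was recorded as sparse by the previous backup (the "history").
    """
    return [cid in sparse_containers for _, cid, _ in next_backup_chunks]


# Previous backup: container 1 is well used, container 2 barely used.
previous = [("a", 1, 40), ("b", 1, 40), ("c", 2, 10), ("a", 1, 40)]
sparse = find_sparse_containers(previous, container_size=100)
decisions = rewrite_decisions([("c", 2, 10), ("a", 1, 40)], sparse)
```

Here only the chunk stored in the poorly used container is marked for rewriting; the chunk in the well-used container stays deduplicated.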

Similar resources

In-line Deduplication for Cloud storage to Reduce Fragmentation by using Historical Knowledge

A backup and recovery system copies and archives data on different cloud servers so that the data can be used to recover the original data after a loss event. The purpose of backup is to recover data after its loss and to restore data from an earlier point in time. In backup systems, the fragments of every data file are physically distributed over multiple servers, which in...

Survey on Fragmentation for Deduplication in Backup Storage

In backup environments, deduplication yields major advantages. Deduplication is the process of automatically eliminating duplicate data in a storage system, and it is among the most effective techniques for reducing storage costs. Deduplication predictably results in data fragmentation, because logically continuous data is spread across many disk locations. Fragmentation is mainly caused by duplicates from previ...

Improving restore speed for backup systems that use inline chunk-based deduplication

Slow restoration due to chunk fragmentation is a serious problem facing inline chunk-based data deduplication systems: restore speeds for the most recent backup can drop orders of magnitude over the lifetime of a system. We study three techniques (increasing cache size, container capping, and using a forward assembly area) for alleviating this problem. Container capping is an ingest-time operati...

A Lookahead Read Cache: Improving Read Performance of Deduplication Storage for Backup Applications

Data deduplication (dedupe for short) is a special data compression technique that has been widely adopted, especially in backup storage systems, with the primary aims of saving backup time and storage. Thus, most traditional dedupe research has focused on improving write performance during the dedupe process, while very little effort has been made at read ...

Survey on Data Deduplication for Cloud Storage to Reduce Fragmentation

Data deduplication is an important technique that allows more information to be stored in less space. The cost and maintenance of information backup storage systems for major enterprises can be minimized by storing backups on cloud storage. Data redundancy between different kinds of data storage is minimized by using data deduplication. By giving each application differently and...

Publication date: 2016